Boosting for Text Classification with Subject Headings
Abstract
The aim of this study is to investigate how Medical Subject Headings (MeSH), used as a source of background knowledge, can improve text classification results. The hypothesis is tested on two different sets of medical documents using a text classifier based on a hidden Markov model (HMM). Experimental results show that using MeSH improves classification accuracy.
Similar resources
Full-texts representation with Medical Subject Headings, and co-citations network reranking strategies for TREC 2014 Clinical Decision Support Track
In TREC 2014 Clinical Decision Support Track, the task was to retrieve full-texts relevant for answering generic clinical questions about medical records. For this purpose, we investigated a large range of strategies in the five runs we officially submitted. Concerning Information Retrieval (IR), we tested two different indexing levels: documents or sections. Section indexing was clearly below ...
Combining Active Learning and Boosting for Naïve Bayes Text Classifiers
This paper presents a variant of the AdaBoost algorithm for boosting Naïve Bayes text classifier, called AdaBUS, which combines active learning with boosting algorithm. Boosting has been evaluated to effectively improve the accuracy of machine-learning based classifiers. However, Naïve Bayes classifier, which is remarkably successful in practice for text classification problems, is known not to...
BoosTexter: A Boosting-based System for Text Categorization
This work focuses on algorithms which learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of the new boosting algorithms for text categorization tasks. We present results comparing the performance of BoosTexter and a number of other t...
A Boosting Algorithm for Classification of Semi-Structured Text
The focus of research in text classification has expanded from simple topic identification to more challenging tasks such as opinion/modality identification. Unfortunately, the latter goals exceed the ability of the traditional bag-of-word representation approach, and a richer, more structural representation is required. Accordingly, learning algorithms must be created that can handle the struc...
TreeBoost.MH: A Boosting Algorithm for Multi-label Hierarchical Text Categorization
Hierarchical Text Categorization (HTC) is the task of generating (usually by means of supervised learning algorithms) text classifiers that operate on hierarchically structured classification schemes. Notwithstanding the fact that most large-sized classification schemes for text have a hierarchical structure, so far the attention of text classification researchers has mostly focused on algorithm...
Publication date: 2006